Storage system

ABSTRACT

The storage system includes a data dividing means for dividing writing target data into a plurality of units of partial data, and generating units of new divided file data; an index file generation means for generating, for each of the units of partial data, an index entry, and generating index file data by adding test data for error detection; a data writing means for writing the divided file data and the index file data; and a recovery means for detecting an error in the index entries written in the storage device, based on the test data included in each of the index entries. The recovery means deletes an index entry in which an error is detected and all of the subsequent index entries in the index file data stored in the storage device, from the index file data.

TECHNICAL FIELD

The present invention relates to a storage system which divides data to be stored and stores the data in a storage device.

BACKGROUND ART

In recent years, along with the development and the spread of computers, various kinds of information are put into digital data. As devices for storing such digital data, storage devices such as a magnetic tape and a magnetic disk have been known. As data to be stored has increased day by day and the amount thereof has become huge, a large-capacity storage system is required. Moreover, it is required to keep reliability while reducing the cost for storage devices. In addition, it is also required that data can be easily retrieved later. As a result, a storage system capable of automatically increasing the storage capacity and the performance thereof and eliminating duplicated storage content to reduce the cost for storage, with high redundancy, is desired.

Under such circumstances, a content address storage system has been developed recently as shown in Patent Document 1. This content address storage system distributedly stores data into a plurality of storage devices, and specifies a storing position where the data is stored based on a unique content address specified corresponding to the content of the data. To be specific, the content address storage system divides predetermined data into a plurality of fragments, adds a fragment as redundant data thereto, and stores these fragments into a plurality of storage devices, respectively.

Later, by designating a content address, it is possible to read data, namely, a fragment, stored in a storing location specified by the content address, and recover predetermined data before the division from a plurality of fragments.

Further, as the content address, a hash value of data, which is generated so as to be unique corresponding to the content of data, is used. As such, in the case of duplicated data, it is possible to acquire data of the same content with reference to the data in the same storing position. Accordingly, it is not necessary to separately store duplicated data, whereby it is possible to eliminate duplicated records and reduce the data capacity.
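The relationship between a content address and duplicate elimination can be illustrated with a minimal sketch. The class name ContentAddressStore, the in-memory dictionary, and the choice of SHA-256 as the hash are assumptions made only for illustration; the text above states only that a hash value of the data is used.

```python
import hashlib


class ContentAddressStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The content address is derived from the data itself (SHA-256 is an assumption).
        address = hashlib.sha256(data).hexdigest()
        # Identical content yields the same address, so it is stored only once.
        self._blocks.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        return self._blocks[address]

    def block_count(self) -> int:
        return len(self._blocks)
```

Writing the same content twice returns the same address and leaves a single stored block, which is the duplicate elimination effect described above.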

A storage system having the above-described duplicated record elimination function includes an upper-level file system and a lower-level file system, with the following characteristics:

The upper-level file system divides a written file into a plurality of files internally.

The divided files are written from the upper-level file system to a lower-level file system respectively, and are synchronized with a stable storage device by the lower-level file system.

The lower-level file system does not ensure the writing sequence of the data. As such, if system down occurs in the process of data writing, a part of the data might be dropped.

FIG. 1 shows a state where a file F is divided into two by file division. First, the upper-level file system generates a file 1 (F1) and a file 2 (F2) by dividing the file F into a plurality of units of partial data (F1_1, F2_2, etc.), and also generates an index file Idx which records mapping information of the original written file F and the file 1 (F1) and the file 2 (F2) generated by the division. The index file Idx has mapping information of each of the divided units of partial data (F1_1, F2_2, etc.) as an index entry (I-1, etc.).

The mapping information in the index entry mainly includes the following information:

Information of a corresponding file.

Offset information from the head of the file in the file before the division.

Offset information from the head of the file in the divided file.

Data size information.

As an example in which a file system that divides a file as described above is used, software for data backup has been known. In backup software, backup data is divided into a “data part” and a “marker part” inserted by the backup software, at the upper level of the file system. In general, determination of data deduplication is performed in such a manner that data of a file is sectioned to have a given length (fixed length or variable length) and that units of the sectioned data are compared. As such, if there is a difference of data in one file in a space smaller than the length of the sectioned file, such portions of data are not determined to be the same content data. This means that even if there are portions of data of the same content between the sectioned units of data, if there is a slight difference, both sectioned units of data are stored, whereby deduplication of data to be stored cannot be performed efficiently. Further, in the software for data backup, there is a case where unique information is inserted for each backup such as a backup time, besides the data to be backed up, and such a marker part is obstructive to the deduplication between respective full backups.

Accordingly, as described above, by dividing backup data into a “data part” and a “marker part” at the upper level of a file system, it is possible to improve the effect of deduplication of backup data on the “data part” side. In particular, in the case of acquiring full backups for several generations, as it is expected that duplicated portions are significantly large between respective full backups, it is possible to further improve the deduplication function, whereby the storage region can be reduced with high efficiency.
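The effect of separating the marker part can be illustrated with a small sketch, assuming fixed-length sectioning into 8-byte chunks and SHA-256 for comparison. The record contents, the marker strings "t1" and "t2", and the helper name chunk_hashes are all hypothetical.

```python
import hashlib


def chunk_hashes(data: bytes, size: int = 8) -> set:
    """Section data into fixed-length chunks and return the set of chunk hashes."""
    return {
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    }


# Four 6-byte records that are identical in both full backups.
records = [f"rec{i:03d}".encode() for i in range(4)]

# Each backup interleaves its own 2-byte marker (e.g. a backup time) with the records.
backup1 = b"".join(b"t1" + r for r in records)
backup2 = b"".join(b"t2" + r for r in records)

# Mixed stream: every 8-byte chunk contains a marker, so no chunk is shared.
shared_mixed = len(chunk_hashes(backup1) & chunk_hashes(backup2))

# Data part only: the marker has been split off, so all chunks are shared.
data_part = b"".join(records)
shared_split = len(chunk_hashes(data_part) & chunk_hashes(data_part))
```

In this toy case the mixed streams share no chunk at all, while the separated data parts share every chunk, which is the improvement on the “data part” side described above.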

Patent Document 1: JP 2005-235171 A

However, in such a file system, if system down occurs in the process of data writing, there is a case where each of the divided files is left in an incomplete state, like the portions not indicated by reference signs in FIG. 2, for example. Particularly, among the divided files, an index file Idx which records mapping information of the respective files is an important file, and if the content thereof becomes incomplete, data accessing cannot be performed normally.

Accordingly, an object of the present invention is to provide a storage system which solves the above-described problem, that is, a disadvantage that it becomes impossible to perform data accessing normally in a file system.

In order to achieve the above-described object, a storage system, which is an aspect of the present invention, is configured to include

a data dividing means for dividing data, to be written into a given storage device, into a plurality of units of partial data, sorting the units of the partial data into a plurality of classifications according to a predetermined criterion, and for each of the classifications, generating new divided file data by linking the units of the partial data;

an index file generation means for generating, for each of the units of the partial data, an index entry including location information in the data to be written before division of the units of the partial data and location information in the divided file data generated after the division of the units of the partial data, adding test data for error detection to the index entry, and generating index file data by linking a plurality of the index entries;

a data writing means for writing the divided file data generated by the data dividing means, and the index file data generated by the index file generation means, into the storage device; and

a recovery means for detecting an error in the index entries written in the storage device, based on the test data included in each of the index entries, wherein

the recovery means deletes an index entry in which an error is detected and all of subsequent index entries in the index file data stored in the storage device, from the index file data.

Further, a program, which is another aspect of the present invention, is a program for causing an information processing device to realize:

a data dividing means for dividing data, to be written into a given storage device, into a plurality of units of partial data, sorting the units of the partial data into a plurality of classifications according to a predetermined criterion, and for each of the classifications, generating new divided file data by linking the units of the partial data;

an index file generation means for generating, for each of the units of the partial data, an index entry including location information in the data to be written before division of the units of the partial data and location information in the divided file data generated after the division of the units of the partial data, adding test data for error detection to the index entry, and generating index file data by linking a plurality of the index entries;

a data writing means for writing the divided file data generated by the data dividing means, and the index file data generated by the index file generation means, into the storage device; and

a recovery means for detecting an error in the index entries written in the storage device, based on the test data included in each of the index entries, wherein

the recovery means deletes an index entry in which an error is detected and all of subsequent index entries in the index file data stored in the storage device, from the index file data.

Further, an information processing method, which is another aspect of the present invention, is configured to include, in an information processing device:

dividing data, to be written into a given storage device, into a plurality of units of partial data, sorting the units of the partial data into a plurality of classifications according to a predetermined criterion, and for each of the classifications, generating new divided file data by linking the units of the partial data;

generating, for each of the units of the partial data, an index entry including location information in the data to be written before division of the units of the partial data and location information in the divided file data generated after the division of the units of the partial data, adding test data for error detection to the index entry, and generating index file data by linking a plurality of the index entries;

writing the divided file data and the index file data into the storage device; and

detecting an error in the index entries written in the storage device, based on the test data included in each of the index entries, and deleting an index entry in which an error is detected and all of subsequent index entries in the index file data stored in the storage device, from the index file data.

As the present invention is configured as described above, even if data written in a storage device becomes incomplete due to system down or the like, subsequent data accessing can be performed normally.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 shows a state where a file to be written into a storage device is divided.

FIG. 2 shows a state where divided files written in the storage device become incomplete.

FIG. 3 shows a configuration of a storage system according to a first exemplary embodiment of the present invention.

FIG. 4 shows an example of the file information table disclosed in FIG. 3.

FIG. 5 is a flowchart showing an operation of the storage system disclosed in FIG. 3.

FIG. 6 is a flowchart showing an operation of the storage system disclosed in FIG. 3.

FIG. 7 is a flowchart showing an operation of the storage system disclosed in FIG. 3.

FIG. 8 is a flowchart showing an operation of the storage system disclosed in FIG. 3.

FIG. 9 is a flowchart showing an operation of the storage system disclosed in FIG. 3.

FIG. 10 shows a state where an index file is modified in the storage system disclosed in FIG. 3.

FIG. 11 shows a state where a divided file is modified in the storage system disclosed in FIG. 3.

FIG. 12 shows a state where a divided file is modified in the storage system disclosed in FIG. 3.

FIG. 13 shows a configuration of a storage system according to supplementary note 1 of the present invention.

EXEMPLARY EMBODIMENTS

First Exemplary Embodiment

A first exemplary embodiment of the present invention will be described with reference to FIGS. 3 to 12. FIGS. 3 and 4 are drawings for explaining a configuration of a storage system according to the present embodiment, and FIGS. 5 to 12 are drawings for explaining the operation of the storage system.

[Configuration]

A storage system 1 of the present invention is configured of one server computer or a plurality of server computers connected with each other. As shown in FIG. 3, the storage system 1 includes two file systems, namely a file system A and a file system B. The file system A has, for example, a function of controlling writing and reading operations of the storage system 1 itself, and the file system B has a function of actually storing data in a storage device.

It should be noted that the storage system 1 of the present embodiment is a content address storage system which divides data and makes it redundant, stores the divided units of data into a plurality of storage devices distributedly, and specifies the stored location where the data is stored according to a unique content address to be set corresponding to the content of the data to be stored. Thereby, the storage system 1 realizes deduplication of data to be stored. However, the storage system 1 of the present invention is not limited to a content address storage system, and not limited to that having a deduplication function.

The storage system 1 of the present embodiment includes, in the file system A, a data attribute determination section 11, a file dividing section 12, an index file generation section 13, a data writing section 14, and a recovery section 15, which are configured by a program being incorporated in an arithmetic unit. The storage system 1 also includes a file information table 16 formed in the main memory.

Although not shown, the storage system 1 also includes a plurality of storage devices which are accessible by the file system B. The storage system 1 has a function of further dividing the divided files F1 and F2 and the index file Idx which will be described below, making them redundant, storing them distributedly in a plurality of storage devices, and realizing deduplication.

The data attribute determination section 11 (data dividing means) determines which of the predetermined attributes (classifications) each unit of partial data in a writing target file (writing target data) belongs to. In the present embodiment, a file F which is a writing target is backup data, for example, and the data attribute determination section 11 determines which of two attributes the data belongs to, namely a “data part” which is the actual data part of the backup data and in which the values are not changed due to the generated time, the number of updates, and the like, and a “marker part” in which the values are changed due to the time or the number of updates, such as time stamps or serial numbers, and which includes management information of the file itself. It should be noted that in the data attribute determination section 11, reference information for determining the attribute has been set in advance from the data content of each unit of partial data in the file F, and the attribute is determined in accordance with such reference information.

The file dividing section 12 (data dividing means) divides the data of the file F into respective units of partial data according to the attributes determined by the data attribute determination section 11, sorts the units of partial data by respective attributes, and newly generates divided file data respectively. For example, in the present embodiment, the units of partial data belonging to the “data part” in the file F are sorted into the file 1 (F1) which is divided file data after the division, and the units of partial data belonging to the “marker part” are sorted into the file 2 (F2) which is divided file data after the division. Then, the sorted units of partial data are combined by each file (File 1, File 2) corresponding to each attribute. More specifically, as shown in FIG. 1, in the file F which is a writing target, the units of partial data F1_1 to F1_7 are sorted into the file 1 and the units of partial data F2_1 to F2_6 are sorted into the file 2.
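The sorting and linking performed by the data attribute determination section 11 and the file dividing section 12 can be sketched as follows. The function names, the attribute labels, and in particular the criterion in classify_by_prefix (a unit starting with "#" is treated as a marker) are hypothetical; the actual reference information used for the determination is not specified here.

```python
from typing import Callable, Dict, Iterable

DATA, MARKER = "data", "marker"


def classify_by_prefix(unit: bytes) -> str:
    # Hypothetical criterion: units starting with b"#" belong to the marker part.
    return MARKER if unit.startswith(b"#") else DATA


def divide(units: Iterable[bytes], classify: Callable[[bytes], str]) -> Dict[str, bytes]:
    """Sort units of partial data by attribute and link them into divided file data."""
    divided = {DATA: bytearray(), MARKER: bytearray()}
    for unit in units:
        # Units keep their original relative order within each divided file.
        divided[classify(unit)] += unit
    return {name: bytes(buf) for name, buf in divided.items()}
```

For example, divide([b"aaa", b"#t1", b"bbb", b"#t2"], classify_by_prefix) links b"aaa" and b"bbb" into the data part and the two marker units into the marker part.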

Here, the processing of dividing the file F and generating the files 1 and 2 as described above is performed on the main memory in the storage system 1, and the files 1 and 2 are actually written in a storage device at the time of data synchronization of the file system A and the file system B by the data writing section 14, as described below.

While the present embodiment shows, as an example, the case of dividing the writing target file F into two files, the present invention is not limited to the case where the number of files generated by division is two, and is applicable to the case where a file is divided into a larger number of files.

When the data of the file F is divided into respective units of partial data as described above, the index file generation section 13 (index file generation means) generates index entries for the respective units of partial data and links them to thereby generate an index file Idx (index file data). It should be noted that an index entry is generated using information stored in the file information table 16 for example, and as shown in FIG. 4, includes data such as “originalFile_offset” representing location information in the file F before the division of a unit of partial data corresponding to the index entry, “fileA_offset” or “fileB_offset” (offset information of a file described in “current_File”) representing location information in the divided file data (file 1 or file 2) generated from the partial data, “data_size” which is data size information representing the data size of the partial data itself, and “index_sync” representing whether or not synchronization with the file system B has been completed. It should be noted that while the initial set value of “index_sync” is “0”, when synchronization with the file system B has been completed, the value is set to “1”.

The index file generation section 13 also adds, to each of the index entries, test data for error detection to be used for detecting irregularities in the index entries. This test data is a redundant code such as “CRC32” for example, but the test data is not limited to such data.
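As a minimal sketch of an index entry carrying test data, the following packs the fields named above into a fixed-size record and appends a CRC32 of the record body. The binary layout (little-endian, one byte for the file identifier and for “index_sync”, 64-bit offsets and size), the single file_offset field standing in for “fileA_offset” or “fileB_offset”, and the names IndexEntry and unpack_entry are assumptions for illustration, not the format used by the embodiment.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import Optional

# Assumed layout: current_file, original_offset, file_offset, data_size, index_sync
_BODY = struct.Struct("<BQQQB")
_CRC = struct.Struct("<I")
ENTRY_SIZE = _BODY.size + _CRC.size


@dataclass
class IndexEntry:
    current_file: int      # which divided file the unit was sorted into
    original_offset: int   # offset in the file before the division
    file_offset: int       # offset in the divided file
    data_size: int         # size of the unit of partial data
    index_sync: int = 0    # 1 once synchronization has completed

    def pack(self) -> bytes:
        body = _BODY.pack(self.current_file, self.original_offset,
                          self.file_offset, self.data_size, self.index_sync)
        # Test data for error detection: CRC32 over the entry body.
        return body + _CRC.pack(zlib.crc32(body))


def unpack_entry(raw: bytes) -> Optional[IndexEntry]:
    """Return the decoded entry, or None if the length or CRC32 check fails."""
    if len(raw) != ENTRY_SIZE:
        return None
    body = raw[:_BODY.size]
    (stored_crc,) = _CRC.unpack(raw[_BODY.size:])
    if zlib.crc32(body) != stored_crc:
        return None
    return IndexEntry(*_BODY.unpack(body))
```

An entry that was only partially written, or whose bytes were altered, fails the check in unpack_entry and can then be treated as an incorrect index entry.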

The data writing section 14 (data writing means) writes the file 1 and the file 2 which are divided file data generated by the file dividing section 12, and the index file Idx generated by the index file generation section 13, into the file system B. Specifically, the data writing section 14 actually writes the files 1 and 2 and the index file Idx generated on the main memory in the storage system 1 into an auxiliary storage device at the time of data synchronization between the file system A and the file system B. In particular, when writing of the index entries into the auxiliary storage device has been completed, the data writing section 14 sets “index_sync” to “1”, and adds specific information.

During the time when the data writing section 14 writes the file 1, the file 2, and the index file Idx into the auxiliary storage device, if system down occurs because of occurrence of a failure in the storage system 1 itself or in the file system B, the recovery section 15 (recovery means) performs recovery processing such as checking and restoration of the data when the data written in the auxiliary storage device is accessed next time.

To be specific, the recovery section 15 performs error detection processing by checking the test data for error detection stored in each of the index entries in the index file Idx. If the recovery section 15 detects an error in any index entry, the recovery section 15 performs modification to delete, from the index file Idx, such an index entry and all of the subsequent index entries located closer to the end from such an index entry. In this process, the recovery section 15 performs error detection processing of index entries in sequence from the end to the beginning of the index file Idx, and also performs back read to check the value of the “index_sync” in each of the index entries. If the “index_sync” in any of the index entries is “1”, the recovery section 15 ends the error detection processing performed in sequence from the end of the index file Idx, that is, back read. It should be noted that the recovery section 15 also ends the back read when the reading reaches the first entry of the index file Idx during the back read.

Further, upon completion of checking and modification of the respective index entries as described above, the recovery section 15 specifies the file sizes of the file 1 and the file 2 which are divided files, from the information in the index entry located at the end of the index file Idx after the modification. Then, the recovery section 15 checks whether the file sizes specified for the files 1 and 2 and the file sizes of the actual files 1 and 2 conform with each other, and extends or deletes the end of the files 1 and 2 in order that the actual file sizes of the file 1 and the file 2 conform to the file sizes specified from the index entry. The specific content of the processing will be described below.

[Operation]

Next, operation of the storage system 1 will be described with reference to the flowcharts of FIGS. 5 to 9 and FIGS. 10 to 12.

First, with reference to FIG. 5, file division and generation of an index file by the storage system 1 will be described. When writing the file F into the file system A, the storage system 1 initializes the file information table 16 (step S1) and generates a file, and writes various types of information such as inode number and the like with respect to the file as a header in the index file Idx (step S2). Then, when the file F is written (step S3), the data attribute determination section 11 checks the data attribute of each of the units of partial data, and writes the data attribute on “current_File” in the file information table 16 (step S4), and the file dividing section 12 writes respective units of partial data into the respective files 1 and 2 corresponding to the data attributes (step S5).

Then, each time the storage system 1 writes data from the file F (step S6), the storage system 1 determines the data attribute (step S7) and starts writing of the data (step S11). In this process, if the attribute of a unit of partial data determined by the data attribute determination differs from that of a unit of partial data immediately before (No at step S8), the index file generation section 13 generates an index entry. The storage system 1 writes the index entry on the index file Idx (step S9), and updates “current_File” (step S10).

Then, when writing of all units of partial data in the file F has completed (step S11, Yes at step S12), the storage system 1 finally writes the index entry (step S13). Such writing is performed on the main memory, and the data is actually stored in an auxiliary storage device at the time of synchronization to be performed later.

Next, data writing processing at step S5 and step S11, as shown in FIG. 5, will be described with reference to the flowchart of FIG. 6. The file dividing section 12 writes, in accordance with “current_File” in the file information table 16 (step S21), partial data of the file F into the file 1 or the file 2 of the file system B (steps S22, S23), and adds the size of the written partial data to the “data_size” in the file information table 16 (step S24).

Next, index entry writing processing at step S9, as shown in FIG. 5, will be described with reference to the flowchart of FIG. 7. As described above, if the attribute of a unit of partial data of the file F is changed, an index entry is written. In this process, “current_File” in the file information table 16 is checked (step S31), and based on the respective pieces of information in the file information table 16, a redundant code for testing is calculated (steps S32, S35). Then, the respective pieces of information and the redundant code for testing are written as one index entry on the index file Idx of the file system B (step S33, S34). After the index entry is written on the index file Idx, “data_size” is added to “fileA_offset” or “fileB_offset” (offset information of the file described in “current_File”) and “originalFile_offset” (steps S34, S37, and S38), and “index_sync” is set to 0 (step S39).
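The bookkeeping described for FIGS. 5 to 7 can be sketched as follows: “data_size” accumulates while the attribute stays the same, an index entry is emitted when the attribute changes and once more at the end, and the offsets are advanced by “data_size” after each entry. The function name build_index, the dictionary-based table, and the single "file_offset" key (standing in for “fileA_offset” or “fileB_offset”) are hypothetical simplifications; the redundant code and “index_sync” are omitted here.

```python
from typing import Dict, List, Tuple


def build_index(units: List[Tuple[str, int]]) -> List[dict]:
    """units: (attribute, size) pairs in the order they appear in the original file."""
    table = {"current_File": None, "originalFile_offset": 0, "data_size": 0}
    file_offsets: Dict[str, int] = {}   # per divided file offset
    entries: List[dict] = []

    def write_entry() -> None:
        name = table["current_File"]
        entries.append({
            "current_File": name,
            "originalFile_offset": table["originalFile_offset"],
            "file_offset": file_offsets[name],
            "data_size": table["data_size"],
        })
        # After the entry is written, advance both offsets by data_size.
        file_offsets[name] += table["data_size"]
        table["originalFile_offset"] += table["data_size"]
        table["data_size"] = 0

    for attribute, size in units:
        if table["current_File"] is None:
            table["current_File"] = attribute
            file_offsets.setdefault(attribute, 0)
        elif attribute != table["current_File"]:
            write_entry()                      # attribute changed: close the run
            table["current_File"] = attribute
            file_offsets.setdefault(attribute, 0)
        table["data_size"] += size

    if table["current_File"] is not None:
        write_entry()                          # final entry after the last unit
    return entries
```

With units of sizes 10 and 5 for file 1, then 3 for file 2, then 7 for file 1, this yields three entries whose offsets in the original file are 0, 15, and 18, and whose offsets in the divided files are 0, 0, and 15.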

Next, data synchronization processing when a data synchronization command is issued to the file system A, that is, an operation of actually writing data of the files 1 and 2 and the index file Idx, generated as described above, from the main memory to an auxiliary storage device, will be described with reference to the flowchart of FIG. 8.

When a data synchronization command is executed (step S41), index entries are written (step S42), and a data synchronization command is issued with respect to all of the files of the file system B (step S43). Finally, “index_sync” in the file information table 16 is set to “1” (step S44). Thereby, “index_sync” in the index entry generated immediately after the data synchronization is “1”, while the others are “0”.

If system down occurs in the process of data writing to the file system B during the data synchronization, recovery processing will be performed at the time of the next access to the file F. This recovery processing will be described with reference to the flowchart of FIG. 9 and FIGS. 10 to 12.

In the recovery processing (step S51), as shown in FIG. 10, the index entries in the index file Idx are read back from the end of the index file Idx (step S52). Then, with use of the redundant code for testing in each of the index entries, it is checked whether there is any incorrect index entry (step S53). In this process, it is also checked whether “index_sync” in an index entry is “1”, or the back read reaches the head of the index file Idx (step S54). If the reading does not reach the head, the previous entry is read (step S55), and the same processing is performed.

During the back read, if “index_sync” in an index entry is “1” or the back read reaches the head of the index file Idx, the back read is ended (step S54). In this process, if there is any incorrect index entry, all of the subsequent index entries, namely the index entries from the incorrect one to the end of the index file Idx, are deleted (step S56). For example, if “index_sync” of the index entry of the reference sign I_9 shown in FIG. 10 is “1” and the index entry of the right side thereof, which is closer to the end, is incorrect, the index entries of the dotted portions located closer to the end from the index entry of the reference sign I_9 are deleted as shown by an arrow in FIG. 10.

It should be noted that if “index_sync” in an index entry is “1”, it is ensured that respective files before such index entry (file 1, file 2, index file) are synchronized. Accordingly, there is no need to check whether or not the index entries before it are incorrect, and there is no need to perform back read. As described above, by adding “index_sync” to the index entries, it is possible to shorten the back read section.
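The back read of steps S52 to S56 can be sketched on an abstract list of entries, where each entry carries only the result of its error check and its “index_sync” value. The names ScannedEntry and recover_index are hypothetical, and treating “index_sync” as trustworthy only when the entry passes its error check is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScannedEntry:
    crc_ok: bool      # result of the per-entry error check (e.g. CRC32)
    index_sync: int   # 1 if the entry was written right after a completed sync


def recover_index(entries: List[ScannedEntry]) -> List[ScannedEntry]:
    """Back read from the end; drop the earliest bad entry found and all entries after it."""
    first_bad = None
    for i in range(len(entries) - 1, -1, -1):
        entry = entries[i]
        if not entry.crc_ok:
            first_bad = i      # keep scanning: an earlier entry may be bad as well
        elif entry.index_sync == 1:
            break              # entries before this point are known to be synchronized
    return entries if first_bad is None else entries[:first_bad]
```

Because the scan stops at the first valid entry whose “index_sync” is 1, entries located before that point are never examined, which corresponds to the shortened back read section described above.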

Then, when checking and modification of the index entries have been completed, it is checked whether there is any difference between the data size of each of the files 1 and 2 and the end of the region indicated by the normal last index entry corresponding to each of the files 1 and 2 (step S57). For example, the size of the file 1 is specified based on the location information in the file 1 and the data size included in the index entry of the reference sign I_9 located at the end of the normal portion of the index file shown in FIG. 11, and it is compared with the size of the actual file 1. If the size of the actual file 1 is larger than the size specified from the last index entry, incomplete data located after the end shown by the reference sign F1-5 of the actual file 1 corresponding to the last index entry is deleted to thereby cut the file 1. This means that the partial data shown by the dotted lines in FIG. 11, which is the end portion of the file 1, is deleted as shown by an arrow up to the size specified from the last index entry (step S58).

On the other hand, if the size of the actual file 1 is smaller than the size specified from the last index entry, the end of the file 1 is extended up to the size specified by the normal last index entry (step S59).

Then, with respect to the file 2, it is also checked whether there is any difference between the data size of the file 2 and the end of the region indicated by the normal last index entry corresponding to the file 2, in a similar manner (step S60). For example, the size of the file 2 is specified based on the location information in the file 2 and the data size included in the index entry of the reference sign I_8 located at the end of the normal portion of the index file corresponding to the file 2 shown in FIG. 11, and it is compared with the size of the actual file 2. If the size of the actual file 2 is larger than the size specified from the last index entry, incomplete data located after the end of the actual file 2 corresponding to the last index entry is deleted to thereby cut the file 2 (step S61).

On the other hand, if the size of the actual file 2 is smaller than the size specified from the last index entry (reference sign I_8) corresponding to the file 2, the end of the file 2 is extended up to the size specified by the normal last index entry as shown by the dotted lines and an arrow in FIG. 12 (step S62).
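The size adjustment of steps S57 to S62 can be sketched as follows, assuming that the size specified from the last normal index entry is the offset in the divided file plus the data size. The function names are hypothetical, and the content of an extended region is left to the platform (on most systems it is zero-filled), since the text does not state what the extension contains.

```python
import os


def expected_size(file_offset: int, data_size: int) -> int:
    """Size of a divided file as specified by its last normal index entry."""
    return file_offset + data_size


def conform_file_size(path: str, expected: int) -> None:
    """Cut or extend the end of the file so that its size equals the expected size."""
    if os.path.getsize(path) == expected:
        return
    with open(path, "r+b") as f:
        # truncate() shortens a longer file and extends a shorter one;
        # the content of an extended area is platform dependent.
        f.truncate(expected)
```

For instance, if the last normal entry for the file 1 records an offset of 4 and a data size of 2, a 10-byte file is cut to 6 bytes, while a 3-byte file is extended to 6 bytes.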

As the storage system of the present invention is configured as described above, even if system down occurs during writing, consistency between files of the file 1, the file 2, and the index file can be maintained, whereby the next access to the corresponding file can be performed normally.

<Supplementary Notes>

The whole or part of the exemplary embodiments disclosed above can bedescribed as, but not limited to, the following supplementary notes.Hereinafter, the outlines of the configuration of a storage systemaccording to the present invention will be described with reference toFIG. 13. However, the present invention is not limited to theconfiguration described below.

(Supplementary Note 1)

A storage system 100 comprising:

a data dividing means 101 for dividing data, to be written into a given storage device, into a plurality of units of partial data, sorting the units of the partial data into a plurality of classifications according to a predetermined criterion, and for each of the classifications, generating new divided file data by linking the units of the partial data;

an index file generation means 102 for generating, for each of the units of the partial data, an index entry including location information in the data to be written before division of the units of the partial data and location information in the divided file data generated after the division of the units of the partial data, adding test data for error detection to the index entry, and generating index file data by linking a plurality of the index entries;

a data writing means 103 for writing the divided file data generated by the data dividing means, and the index file data generated by the index file generation means, into the storage device; and

a recovery means 104 for detecting an error in the index entries written in the storage device, based on the test data included in each of the index entries, wherein

the recovery means deletes an index entry in which an error is detected and all of subsequent index entries in the index file data stored in the storage device, from the index file data.
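
As a non-limiting illustration of supplementary note 1, the following Python sketch shows one way an index entry could carry test data for error detection, and how recovery could delete an erroneous entry together with all subsequent entries. The fixed-size entry layout, the use of CRC-32 as the test data, and the names `make_entry`, `entry_ok`, and `recover_index` are assumptions made for this example only; the embodiments do not prescribe them.

```python
import struct
import zlib

# Hypothetical fixed-size entry body (an assumption, not taken from the text):
# offset in the data before division, divided-file id,
# offset in the divided file data, size of the unit of partial data.
BODY = struct.Struct("<QIQI")
ENTRY_SIZE = BODY.size + 4  # body followed by a 4-byte CRC-32 as test data


def make_entry(src_off, file_id, dst_off, size):
    """Build one index entry and append its test data (CRC-32 of the body)."""
    body = BODY.pack(src_off, file_id, dst_off, size)
    return body + struct.pack("<I", zlib.crc32(body))


def entry_ok(raw):
    """Return True if the entry is complete and its test data matches."""
    if len(raw) != ENTRY_SIZE:
        return False  # a partially written entry counts as an error
    (crc,) = struct.unpack("<I", raw[BODY.size:])
    return zlib.crc32(raw[:BODY.size]) == crc


def recover_index(index):
    """Delete the first erroneous entry and all subsequent entries."""
    for pos in range(0, len(index), ENTRY_SIZE):
        if not entry_ok(index[pos:pos + ENTRY_SIZE]):
            return index[:pos]
    return index
```

In this sketch the index file data is simply the concatenation of entries, so deleting an entry and everything after it amounts to cutting the byte string at the position of the first entry whose test data does not match.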

(Supplementary Note 2)

The storage system, according to supplementary note 1, wherein

the recovery means performs error detection processing on the index entries in the index file data stored in the storage device, in sequence from the end of the index file data.

(Supplementary Note 3)

The storage system, according to supplementary note 2, wherein

the data writing means stores, in the storage device, an index entry having been written in the storage device, among the index entries in the index file data, while adding specific information to the index entry, and

when the recovery means performs the error detection processing on the index entries in sequence from the end of the index file data, if the specific information is added to any of the index entries, the recovery means stops the error detection processing performed on the index entries.
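
The behaviour of supplementary notes 2 and 3 can be sketched as follows, again only as an illustration under stated assumptions. Here the "specific information" is modelled as a one-byte flag inside the entry body, the test data is a CRC-32, and the names `COMMITTED`, `parse_entry`, and `recover_from_end` are hypothetical. The scan runs from the end of the index file data, remembers the earliest erroneous entry seen so far, and stops as soon as it reaches a valid entry carrying the flag.

```python
import struct
import zlib

# Hypothetical entry body: source offset, divided-file id, offset in the
# divided file data, unit size, and a one-byte flag ("specific information").
BODY = struct.Struct("<QIQIB")
ENTRY_SIZE = BODY.size + 4  # body plus 4-byte CRC-32 test data
COMMITTED = 1               # assumed flag value for "already written"


def make_entry(src_off, file_id, dst_off, size, committed=False):
    """Build an entry; committed=True adds the specific information."""
    flag = COMMITTED if committed else 0
    body = BODY.pack(src_off, file_id, dst_off, size, flag)
    return body + struct.pack("<I", zlib.crc32(body))


def parse_entry(raw):
    """Return the unpacked fields, or None if the test data does not match."""
    body = raw[:BODY.size]
    (crc,) = struct.unpack("<I", raw[BODY.size:])
    return BODY.unpack(body) if zlib.crc32(body) == crc else None


def recover_from_end(index):
    """Check entries from the end; stop at the first valid flagged entry."""
    whole = len(index) - len(index) % ENTRY_SIZE
    cut = whole  # a partially written entry at the tail is always dropped
    for pos in range(whole - ENTRY_SIZE, -1, -ENTRY_SIZE):
        fields = parse_entry(index[pos:pos + ENTRY_SIZE])
        if fields is None:
            cut = pos            # error: this entry and all after it go
        elif fields[4] == COMMITTED:
            break                # earlier entries need not be checked
    return index[:cut]
```

Because the scan stops at the flag, the cost of recovery in this sketch depends on the number of entries written since the last flagged entry rather than on the total length of the index file data.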

(Supplementary Note 4)

The storage system, according to any of supplementary notes 1 to 3, wherein the index file generation means allows each of the index entries to include data size information representing a data size of a unit of the partial data corresponding to the index entry, and

the recovery means modifies a file size of the divided file data based on information included in an index entry located at the end of the index file data having been recovered after deletion of the index entry in which an error was detected and all of the subsequent index entries.

(Supplementary Note 5)

The storage system, according to supplementary note 4, wherein

the recovery means extends or deletes the end of the divided file data such that the file size of the divided file data conforms to a file size, the file size being information included in the index entry located at the end of the recovered index file data, and being specified by location information in the divided file data in which the unit of the partial data corresponding to the index entry is included, and the data size information of the unit of the partial data.
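
The size adjustment of supplementary notes 4 and 5 can be illustrated by the following minimal Python sketch. It assumes that each recovered index entry is available as a tuple of (divided-file id, offset in the divided file data, size of the unit of partial data), and that a divided file with no surviving entry should have size 0; both the tuple layout and the names `expected_size` and `conform_file_size` are assumptions made for this example.

```python
import os


def expected_size(entries, file_id):
    """Size implied by the last recovered entry for one divided file.

    Each entry is an assumed (file_id, offset_in_divided_file, unit_size)
    tuple; the expected size is the offset plus the unit size.
    """
    for entry_file, dst_off, size in reversed(entries):
        if entry_file == file_id:
            return dst_off + size
    return 0  # assumption: no surviving entry means an empty divided file


def conform_file_size(path, expected):
    """Extend or cut the end of the file so its size equals `expected`."""
    actual = os.path.getsize(path)
    if actual != expected:
        with open(path, "r+b") as f:
            # truncate() shortens a longer file and extends a shorter one
            f.truncate(expected)
    return actual, expected
```

For example, if the last recovered entry for a divided file gives offset 100 and size 50, `expected_size` yields 150, and `conform_file_size` either removes the bytes beyond position 150 or extends the file up to 150 bytes.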

(Supplementary Note 6)

A program for causing an information processing device to realize:

a data dividing means for dividing data, to be written into a given storage device, into a plurality of units of partial data, sorting the units of the partial data into a plurality of classifications according to a predetermined criterion, and for each of the classifications, generating new divided file data by linking the units of the partial data;

an index file generation means for generating, for each of the units of the partial data, an index entry including location information in the data to be written before division of the units of the partial data and location information in the divided file data generated after the division of the units of the partial data, adding test data for error detection to the index entry, and generating index file data by linking a plurality of the index entries;

a data writing means for writing the divided file data generated by the data dividing means, and the index file data generated by the index file generation means, into the storage device; and

a recovery means for detecting an error in the index entries written in the storage device, based on the test data included in each of the index entries, wherein

the recovery means deletes an index entry in which an error is detected and all of subsequent index entries in the index file data stored in the storage device, from the index file data.

(Supplementary Note 7)

The program, according to supplementary note 6, wherein

the recovery means performs error detection processing on the index entries in the index file data stored in the storage device, in sequence from the end of the index file data.

(Supplementary Note 8)

The program, according to supplementary note 7, wherein

the data writing means stores, in the storage device, an index entry having been written in the storage device, among the index entries in the index file data, while adding specific information to the index entry, and

when the recovery means performs the error detection processing on the index entries in sequence from the end of the index file data, if the specific information is added to any of the index entries, the recovery means stops the error detection processing performed on the index entries.

(Supplementary Note 9)

An information processing method comprising, in an information processing device:

dividing data, to be written into a given storage device, into a plurality of units of partial data, sorting the units of the partial data into a plurality of classifications according to a predetermined criterion, and for each of the classifications, generating new divided file data by linking the units of the partial data;

generating, for each of the units of the partial data, an index entry including location information in the data to be written before division of the units of the partial data and location information in the divided file data generated after the division of the units of the partial data, adding test data for error detection to the index entry, and generating index file data by linking a plurality of the index entries;

writing the divided file data and the index file data into the storage device; and

detecting an error in the index entries written in the storage device, based on the test data included in each of the index entries, and deleting an index entry in which an error is detected and all of subsequent index entries in the index file data stored in the storage device, from the index file data.

(Supplementary Note 10)

The information processing method, according to supplementary note 9, further comprising

performing error detection processing on the index entries in the index file data stored in the storage device, in sequence from the end of the index file data.

(Supplementary Note 11)

The information processing method, according to supplementary note 10, wherein

the writing the data includes storing, in the storage device, an index entry having been written in the storage device, among the index entries in the index file data, while adding specific information to the index entry, and

the performing the error detection processing on the index entries in sequence from the end of the index file data includes, if the specific information is added to any of the index entries, stopping the error detection processing performed on the index entries.

It should be noted that in the above-described exemplary embodiments, a program may be stored in a storage device or in a computer readable recording medium. For example, a recording medium is a portable medium such as a flexible disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like.

While the present invention has been described with reference to the exemplary embodiments described above, the present invention is not limited to the above-described embodiments. The form and details of the present invention can be changed within the scope of the present invention in various manners that can be understood by those skilled in the art.

The present invention is based upon and claims the benefit of priority from Japanese patent application No. 2011-16229, filed on Jan. 28, 2011, the disclosure of which is incorporated herein in its entirety by reference.

DESCRIPTION OF REFERENCE NUMERALS

-   1 storage system
-   11 data attribute determination section
-   12 file dividing section
-   13 index file generation section
-   14 data writing section
-   15 recovery section
-   16 file information table
-   100 storage system
-   101 data dividing means
-   102 index file generation means
-   103 data writing means
-   104 recovery means
-   F file (writing target data)
-   F1 file 1 (divided file data)
-   F2 file 2 (divided file data)
-   Idx index file

1. A storage system comprising: a data dividing unit that divides data, to be written into a given storage device, into a plurality of units of partial data, sorts the units of the partial data into a plurality of classifications according to a predetermined criterion, and for each of the classifications, generates new divided file data by linking the units of the partial data; an index file generation unit that generates, for each of the units of the partial data, an index entry including location information in the data to be written before division of the units of the partial data and location information in the divided file data generated after the division of the units of the partial data, adds test data for error detection to the index entry, and generates index file data by linking a plurality of the index entries; a data writing unit that writes the divided file data generated by the data dividing unit, and the index file data generated by the index file generation unit, into the storage device; and a recovery unit that detects an error in the index entries written in the storage device, based on the test data included in each of the index entries, wherein the recovery unit deletes an index entry in which an error is detected and all of subsequent index entries in the index file data stored in the storage device, from the index file data.
2. The storage system, according to claim 1, wherein the recovery unit performs error detection processing on the index entries in the index file data stored in the storage device, in sequence from the end of the index file data.
3. The storage system, according to claim 2, wherein the data writing unit stores, in the storage device, an index entry having been written in the storage device, among the index entries in the index file data, while adding specific information to the index entry, and when the recovery unit performs the error detection processing on the index entries in sequence from the end of the index file data, if the specific information is added to any of the index entries, the recovery unit stops the error detection processing of the index entries.
4. The storage system, according to claim 1, wherein the index file generation unit allows each of the index entries to include data size information representing a data size of a unit of the partial data corresponding to the index entry, and the recovery unit modifies a file size of the divided file data based on information included in an index entry located at the end of the index file data having been recovered after deletion of the index entry in which an error was detected and all of the subsequent index entries.
5. The storage system, according to claim 4, wherein the recovery unit extends or deletes the end of the divided file data such that the file size of the divided file data conforms to a file size, the file size being information included in the index entry located at the end of the recovered index file data, and being specified by location information in the divided file data in which the unit of the partial data corresponding to the index entry is included, and the data size information of the unit of the partial data.
6. A non-transitory computer-readable medium storing a program comprising instructions for causing an information processing device to realize: a data dividing unit that divides data, to be written into a given storage device, into a plurality of units of partial data, sorts the units of the partial data into a plurality of classifications according to a predetermined criterion, and for each of the classifications, generates new divided file data by linking the units of the partial data; an index file generation unit that generates, for each of the units of the partial data, an index entry including location information in the data to be written before division of the units of the partial data and location information in the divided file data generated after the division of the units of the partial data, adds test data for error detection to the index entry, and generates index file data by linking a plurality of the index entries; a data writing unit that writes the divided file data generated by the data dividing unit, and the index file data generated by the index file generation unit, into the storage device; and a recovery unit that detects an error in the index entries written in the storage device, based on the test data included in each of the index entries, wherein the recovery unit deletes an index entry in which an error is detected and all of subsequent index entries in the index file data stored in the storage device, from the index file data.
7. The non-transitory computer-readable medium storing the program, according to claim 6, wherein the recovery unit performs error detection processing on the index entries in the index file data stored in the storage device, in sequence from the end of the index file data.
8. An information processing method comprising, in an information processing device: dividing data, to be written into a given storage device, into a plurality of units of partial data, sorting the units of the partial data into a plurality of classifications according to a predetermined criterion, and for each of the classifications, generating new divided file data by linking the units of the partial data; generating, for each of the units of the partial data, an index entry including location information in the data to be written before division of the units of the partial data and location information in the divided file data generated after the division of the units of the partial data, adding test data for error detection to the index entry, and generating index file data by linking a plurality of the index entries; writing the divided file data and the index file data into the storage device; and detecting an error in the index entries written in the storage device, based on the test data included in each of the index entries, and deleting an index entry in which an error is detected and all of subsequent index entries in the index file data stored in the storage device, from the index file data.
9. The information processing method, according to claim 8, further comprising performing error detection processing on the index entries in the index file data stored in the storage device, in sequence from the end of the index file data.